Abstract: We review connections between coding-theoretic
objects and sparse learning problems. In particular, we show 
how seemingly different combinatorial objects such as error- 
correcting codes, combinatorial designs, spherical codes, com- 
pressed sensing matrices and group testing designs can be 
obtained from one another. The reductions enable one to translate 
upper and lower bounds on the parameters attainable by one 
object to another. We survey some of the well-known reductions 
in a unified presentation, and bring some existing gaps to 
attention. New reductions are also introduced; in particular, we 
bring up the notion of minimum L-wise distance of codes and 
show that this notion closely captures the combinatorial structure 
of RIP-2 matrices. Moreover, we show how this weaker variation 
of the minimum distance is related to combinatorial list-decoding 
properties of codes. 

I. Introduction 

Consider an N-dimensional vector $x \in \mathbb{C}^N$ that is L-sparse,
i.e., has L or fewer non-zero entries. The basic goal
in compressed sensing is to design a measurement matrix
$M \in \mathbb{C}^{n \times N}$ such that from the measurement outcome

\[ y := M \cdot x \in \mathbb{C}^n \]

it is information-theoretically possible to uniquely reconstruct 
x. Since x can be described by up to L complex numbers 
plus L integers in [N] := {1,...,N} (that describe the 
support of the vector), it is natural to expect that the amount 
of measurements n can be made substantially less than the 
dimension N of the vector, even if one uses a set of linear 
forms as above to encode x. It turns out that the above intuition 
can be formalized and indeed there are measurement matrices 
with significantly fewer rows than columns [1], [2], [3], [4], [5].
In fact, one can even obtain n = 2L by taking
M to be a Vandermonde matrix [6].

Similar to compressed sensing, one can think of different 
sparse recovery problems with the goal of identifying objects 
that are known to have sparse representations. For example, 
compressed sensing can be extended to vectors over finite 
fields, which makes it essentially equivalent to the well- 
studied syndrome decoding problem of error-correcting codes, 
or to non-linear measurements. A particularly interesting class 
of non-linear measurements is characterized by disjunctions, 
which gives rise to a class of sparse recovery problems known 
as (non-adaptive) combinatorial group testing (cf. [7], [8]). In
group testing, the measurement matrix and the sparse vector 
x both lie in the Boolean domain {0, 1}. Then, the ith entry
of the measurement y is defined as the logical expression

\[ y(i) := (M_{i,1} \wedge x_1) \vee (M_{i,2} \wedge x_2) \vee \cdots \vee (M_{i,N} \wedge x_N), \]

where $M_{i,j}$ denotes the jth entry of the ith row of M. As in
compressed sensing, group testing measurement matrices
are known for $n \ll N$.

Even though we have defined the sparse recovery problems 
above in the most basic combinatorial form, in practice it is 
desirable to have measurement matrices with further qualities. 
For example, it is desirable to have an explicit construction of 
the measurement matrix; e.g., a polynomial-time algorithm for 
computing the entries of the matrix. Moreover, the decoding 
algorithm to infer the sparse vector from the measurement 
outcomes is of crucial importance and it is desirable to have 
as efficient a decoder as possible. Third, imprecisions are 
inevitable in practice and the design should be robust in 
presence of errors. 

Going through the vast amount of literature in sparse recov- 
ery makes it evident that the theory of error-correcting codes 
proves to be of central importance in addressing the three basic 
requirements above. In this work, we revisit and highlight 
some of the known connections between coding theory and 
sparse recovery in a unified exposition, and moreover we 
introduce new connections. In particular, we study connec- 
tions between coding-theoretic objects such as codes with 
large distance, list-decodable codes, combinatorial designs, 
and spherical codes to sparse recovery problems. 

Coding theoretic methods have also been successfully ap- 
plied to other sparse recovery problems, such as extensions 
of group testing to the threshold model and learning sparse 
hypergraphs, as well as low-rank matrix completion problems. 
However, due to the space limit, in this presentation we will 
only focus on the basic problems of compressed sensing and 
(noiseless) group testing. Moreover, we will only be able to
emphasize a few of the most basic reductions from
coding-theoretic objects to measurement designs, and vice versa.

The rest of the paper is organized as follows. In Section I-A
we review the notation that we use throughout the paper.
Then, in Section II we introduce the notions of Restricted
Isometry Property (RIP) and disjunct matrices that are central
to compressed sensing and group testing, respectively.
Section III shows how the minimum distance of error-correcting
codes relates to the RIP. Section IV introduces the new
idea of extending the notion of the minimum distance of
codes to tuples of codewords, as opposed to pairs. Then, we
show a new result that this notion is more closely related
to the RIP than the minimum distance. Section V shows
the relationship between codes, combinatorial designs, and
group testing schemes. Section VI touches upon some new
connections between RIP matrices and list-decodable codes.
Finally, Section VII concludes the work with possible future
directions.

A. Notation 

For a vector $v = (v_1, \ldots, v_n)$, we use the convention
$v(i) := v_i$ for the ith entry of v and define $\mathrm{supp}(v) \subseteq [n]$
to denote the support of v. For an $n \times N$ matrix M, and
a subset of column indices $\mathcal{L} \subseteq [N]$, the submatrix of M
obtained by removing all columns of M outside $\mathcal{L}$ is denoted
by $M|_{\mathcal{L}}$. For a complex vector v, the $\ell_p$ norm of v is denoted
by $\|v\|_p$. When p = 2, we may omit the subscript and simply
write $\|v\|$. For a complex number $a \in \mathbb{C}$, the conjugate of a is
denoted by $a^*$. For the most part in this write-up, we assume
without loss of generality that q-ary codes are defined over the
alphabet $\mathbb{Z}_q$ even if we do not use the ring structure of $\mathbb{Z}_q$.
For Boolean vectors x and y, we use $\Delta(x, y)$ to denote the
Hamming distance between x and y.

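The statistical distance between two distributions $\mathcal{X}$ and $\mathcal{Y}$
with probability measures $\Pr_{\mathcal{X}}(\cdot)$ and $\Pr_{\mathcal{Y}}(\cdot)$ defined on the
same finite space $\mathcal{S}$ is given by $\frac{1}{2} \sum_{s \in \mathcal{S}} |\Pr_{\mathcal{X}}(s) - \Pr_{\mathcal{Y}}(s)|$,
which is half the $\ell_1$ distance of the two distributions when
regarded as vectors of probabilities over $\mathcal{S}$. Two distributions
$\mathcal{X}$ and $\mathcal{Y}$ are said to be $\epsilon$-close if their statistical distance is
at most $\epsilon$.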

II. Combinatorics of Sparse Recovery 

It is easy to see that for the purpose of compressed sensing,
a measurement matrix M can distinguish between all L-sparse
vectors iff for every subset $\mathcal{L}$ of up to 2L columns, the right
kernel of the sub-matrix $M|_{\mathcal{L}}$ is zero. This condition is in
particular achieved by Vandermonde matrices [6]. However,
in general such matrices need not be well-conditioned, in the
sense that the action of the matrix on sparse vectors may
greatly affect their norm, which is not desirable in the presence
of imprecisions and/or noise in the measurements. A stronger
condition would be to require each sub-matrix $M|_{\mathcal{L}}$ to be
nearly orthogonal. This gives rise to the notion of Restricted
Isometry Property (RIP) as defined below.

Definition 1. Let $p, \alpha > 0$ be real parameters. An $n \times N$
matrix $M \in \mathbb{C}^{n \times N}$ is said to satisfy RIP-p of order L with
constant $\alpha$ (or said to have L-RIP-p, in short) if for every
$\mathcal{L} \subseteq [N]$ with $|\mathcal{L}| \le L$ and every column vector $x \in \mathbb{C}^{|\mathcal{L}|}$, we
have $(1 - \alpha)\|x\|_p \le \|M|_{\mathcal{L}} \cdot x\|_p \le (1 + \alpha)\|x\|_p$. The constant
$\alpha$ is sometimes omitted, in which case it is implicitly assumed
to be an absolute constant in (0, 1).

In this work, we will focus on the special case p = 2. 
In this case, it is known that an RIP matrix is sufficient 
for distinguishing between sparse vectors even in presence of 
noise and when the vector being measured is approximately 
sparse (cf. |9), ED, CD)- Moreover, a linear program can be 



used to reconstruct the sparse vector. Similar (but weaker)
results are known about the RIP-1 (cf. [11]).

For group testing, the following basic notion turns out to
exactly capture the combinatorial structure needed for
distinguishing between L-sparse vectors (cf. [7]):

Definition 2. An $n \times N$ binary matrix is called L-disjunct
if, for any choice of L + 1 columns $M_0, \ldots, M_L$ of the matrix,
we have $\bigcup_{i \in [L]} \mathrm{supp}(M_i) \nsupseteq \mathrm{supp}(M_0)$.
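
Definition 2 can be checked by brute force on small matrices. The following is a minimal sketch, assuming the matrix is given as a list of 0/1 rows; the helper names `supports` and `is_disjunct` are illustrative and not from the paper, and the running time is exponential in L.

```python
from itertools import combinations

def supports(matrix):
    """Return the support (set of row indices holding a 1) of every column."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [{i for i in range(n_rows) if matrix[i][j]} for j in range(n_cols)]

def is_disjunct(matrix, L):
    """Brute-force test of L-disjunctness for a 0/1 matrix (list of rows).

    The matrix fails the test if some column's support is contained in
    the union of the supports of L other columns.
    """
    supp = supports(matrix)
    for j0, s0 in enumerate(supp):
        others = [s for j, s in enumerate(supp) if j != j0]
        for group in combinations(others, L):
            if s0 <= set().union(*group):
                return False
    return True

# Columns supported on {0, 1}, {1, 2} and {0, 2}.
M = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
```

Here no column support contains another, so `M` is 1-disjunct, but any two columns together cover the third, so it is not 2-disjunct.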

III. From Minimum Distance to RIP 

In this section we describe a few well-known results about
construction of RIP matrices from codes with good minimum
distance properties. These techniques are used, for example, in
[12], [13], [14] for deterministic construction of RIP matrices
from specific families of codes. The reductions are based on
the following simple embeddings of finite-domain vectors into
the complex domain:

Definition 3. Let $c \in \mathbb{Z}_q^n$ be a q-ary vector.

1) Let $\zeta \in \mathbb{C}$ be a primitive qth root of unity. The spherical
embedding of c, denoted by Sph(c), is a vector $c' \in \mathbb{C}^n$
where for each $i \in [n]$, we define $c'(i) := \zeta^{c(i)} / \sqrt{n}$.

2) For any $i \in \mathbb{Z}_q$, denote by $e_i$ the ith standard basis
vector in $\{0, 1\}^q$. That is, $e_i(j) = 1$ if $j = i + 1$ and
$e_i(j) = 0$ if $j \ne i + 1$. The Boolean embedding of c,
denoted by Bool(c), is a vector $c'' \in \{0, 1\}^{qn}$ obtained
from c by replacing each element c(i) of c with the
q-dimensional vector $e_{c(i)}$.

For example, consider the 4-dimensional binary vector
$c := (0, 1, 1, 0) \in \mathbb{F}_2^4$. Then, we have $\zeta = -1$ and, since
$\sqrt{n} = 2$,

Sph(c) = (1, -1, -1, 1)/2,
Bool(c) = (1, 0, 0, 1, 0, 1, 1, 0).
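
The two embeddings of Definition 3 can be sketched in a few lines. The function names `sph` and `bool_embed` are illustrative, and the choice $\zeta = e^{2\pi i/q}$ is one possible primitive qth root of unity (an assumption; any primitive root would do).

```python
import cmath
import math

def sph(c, q):
    """Spherical embedding: entry i becomes zeta**c[i] / sqrt(n)."""
    n = len(c)
    zeta = cmath.exp(2j * math.pi / q)  # assumed choice of primitive root
    return [zeta ** s / math.sqrt(n) for s in c]

def bool_embed(c, q):
    """Boolean embedding: each symbol s becomes the q-dimensional
    indicator vector with a single 1 at (0-based) position s."""
    out = []
    for s in c:
        out.extend(1 if j == s else 0 for j in range(q))
    return out

c = (0, 1, 1, 0)
spherical = sph(c, 2)        # numerically close to (1, -1, -1, 1) / 2
boolean = bool_embed(c, 2)   # [1, 0, 0, 1, 0, 1, 1, 0]
```

The spherical embedding always has unit Euclidean norm, while the Boolean embedding has exactly n ones among its qn entries.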

The property that is later needed for the RIP constructions 
is the bias of the code, defined below. 

Definition 4. A vector $c \in \mathbb{Z}_q^n$ naturally induces a probability
measure $\mu_c$ on the alphabet $\mathbb{Z}_q$, where for each $i \in \mathbb{Z}_q$, $\mu_c(i)$
is the fraction of coordinate positions at which c is equal to i.
The vector c is said to be $\epsilon$-biased if $\mu_c$ is $\epsilon$-close to uniform.

Definition 5. A code $\mathcal{C} \subseteq \mathbb{Z}_q^n$ is said to be $\epsilon$-biased if, for every
pair of distinct codewords $c, c' \in \mathcal{C}$, the difference vector $c - c'$
is $\epsilon$-biased.
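
As an illustration of Definitions 4 and 5, the bias of a vector is the statistical distance between its empirical symbol distribution and the uniform distribution on $\mathbb{Z}_q$, and the bias of a code is the worst case over differences of distinct codewords. The sketch below assumes codewords are given as sequences of integers; the function names are illustrative.

```python
from collections import Counter

def statistical_distance_to_uniform(v, q):
    """Half the l1 distance between the symbol frequencies of v and uniform."""
    n = len(v)
    counts = Counter(s % q for s in v)
    return 0.5 * sum(abs(counts.get(i, 0) / n - 1 / q) for i in range(q))

def code_bias(code, q):
    """Smallest eps for which the code is eps-biased (brute force over pairs)."""
    worst = 0.0
    for a in range(len(code)):
        for b in range(len(code)):
            if a != b:
                diff = [(x - y) % q for x, y in zip(code[a], code[b])]
                worst = max(worst, statistical_distance_to_uniform(diff, q))
    return worst

balanced_weights = [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)]
skewed = [(0, 0, 0, 0), (0, 0, 0, 1)]
```

In the first code every difference has exactly two ones out of four positions, so the bias is 0; in the second the only difference has symbol frequencies (3/4, 1/4), giving bias 1/4.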

Even though small bias is in general stronger than large 
minimum distance, for balanced codes as defined below the 
two notions are essentially equivalent, up to simple manipu- 
lations of the code. 

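Definition 6. A (possibly non-linear) code $\mathcal{C} \subseteq \mathbb{F}_q^n$ is called
balanced if, for every $c \in \mathcal{C}$ and every $\alpha \in \mathbb{F}_q$, $c + \alpha \mathbf{1} \in \mathcal{C}$,
where $\mathbf{1}$ denotes the all-ones vector.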

Definition 7. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a balanced code. Consider
the equivalence relation between codewords that differ by a
multiple of $\mathbf{1} := (1, \ldots, 1)$. This partitions the codewords of $\mathcal{C}$
into equivalence classes. Define $\mathcal{C}/\mathbf{1}$ to be any sub-code of $\mathcal{C}$
that picks exactly one codeword from each equivalence class.

Proposition 8. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a balanced code with relative
minimum distance at least $1 - (1 + \epsilon)/q$. Then the sub-code
$\mathcal{C}/\mathbf{1}$ is $\epsilon$-biased.

Proof: Consider any pair of distinct codewords $c, c' \in \mathcal{C}/\mathbf{1}$
and define $\mathcal{C}' := \{c' + \alpha \mathbf{1} : \alpha \in \mathbb{Z}_q\}$. Since $\mathcal{C}$ is
balanced, $\mathcal{C}' \subseteq \mathcal{C}$. Moreover, $c \notin \mathcal{C}'$, and therefore, the
relative Hamming distance between c and any codeword in $\mathcal{C}'$
is at least $1 - (1 + \epsilon)/q$. In particular, the fraction of positions at
which $c - c'$ is equal to any value $\alpha \in \mathbb{Z}_q$ is at most $(1 + \epsilon)/q$
(since otherwise, the distinct vectors c and $c' + \alpha \mathbf{1}$ would agree
at more than a $(1 + \epsilon)/q$ fraction of the positions, violating the
minimum distance property). From the definition of statistical
distance, we conclude that $c - c'$ is $\epsilon$-biased. ■

Now we are ready to describe how small bias is related to
geometric properties of the complex embeddings in Definition 3.

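Proposition 9. Suppose $c, c' \in \mathbb{Z}_q^n$ are so that $c - c'$ is
$\epsilon$-biased. Then $|\langle \mathrm{Sph}(c), \mathrm{Sph}(c') \rangle| \le 2\epsilon$.

Proof: For $i \in \mathbb{Z}_q$, let $\mu_i := |\{j : c(j) - c'(j) = i\}|/n$.
We know that the values $\mu_i$ induce a probability distribution
on $\mathbb{Z}_q$ that is $\epsilon$-close to uniform. Define $s_1 := \mathrm{Sph}(c)$ and
$s_2 := \mathrm{Sph}(c')$. We have that

\[ |\langle s_1, s_2 \rangle| = \Big| \sum_{i \in \mathbb{Z}_q} \mu_i \zeta^i \Big| \overset{(a)}{=} \Big| \sum_{i \in \mathbb{Z}_q} (\mu_i - 1/q) \zeta^i \Big| \le \sum_{i \in \mathbb{Z}_q} |\mu_i - 1/q| \le 2\epsilon, \]

where (a) is due to the fact that $\sum_{i \in \mathbb{Z}_q} \zeta^i = 0$. ■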

Definition 10. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a code.

1) The spherical embedding of $\mathcal{C}$ is a complex $n \times |\mathcal{C}|$ matrix
with columns indexed by the elements of $\mathcal{C}$. The column
corresponding to a codeword $c \in \mathcal{C}$ is Sph(c).

2) The Boolean embedding of $\mathcal{C}$ is a real $qn \times |\mathcal{C}|$ matrix
with 0/1 entries and columns indexed by the elements
of $\mathcal{C}$. The column corresponding to a codeword $c \in \mathcal{C}$ is
Bool(c).

Definition 11. A set $C \subseteq \mathbb{C}^n$ is a spherical code¹ if each $c \in C$
satisfies $\|c\| = 1$. Moreover, C is said to be $\epsilon$-coherent if for
any distinct $c, c' \in C$, we have $|\langle c, c' \rangle| \le \epsilon$.

Using the above definition, Proposition 9 immediately
implies the following.

Corollary 12. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be an $\epsilon$-biased code. Then the
column set of Sph($\mathcal{C}$) forms a spherical code with coherence
at most $2\epsilon$.

¹ Traditionally, spherical codes are defined under the constraint of having
upper bounded (but possibly negative) mutual inner products. In this work
we will require them to have low coherence, which is a stronger property.

We are now ready to state how low-coherent spherical codes 
are related to RIP matrices. This is shown in the following 
well-known proposition: 

Proposition 13. Suppose that the column set of an $n \times N$
complex matrix M forms an $\epsilon$-coherent spherical code. Then,
M satisfies RIP-2 of order L with constant $L\epsilon$.

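Proof: Consider an $n \times L$ sub-matrix $M'$ of M where
$M' := (M'_1 \mid \cdots \mid M'_L)$ and the $M'_i$ are unit vectors in $\mathbb{C}^n$,
and let $x = (x_1, \ldots, x_L) \in \mathbb{C}^L$. We can write

\[
\begin{aligned}
\|M'x\|_2^2 - \|x\|_2^2 &= \langle M'x, M'x \rangle - \|x\|_2^2 \\
&= \sum_{i \in [L]} |x_i|^2 \|M'_i\|_2^2 + \sum_{i \ne j} x_i x_j^* \langle M'_i, M'_j \rangle - \|x\|_2^2 \\
&= \sum_{i \ne j} x_i x_j^* \langle M'_i, M'_j \rangle =: \eta.
\end{aligned}
\]

And now we have

\[ |\eta| \le \epsilon \sum_{i \ne j} |x_i x_j| \le \epsilon \sum_{i, j} |x_i| |x_j| = \epsilon \|x\|_1^2 \le L\epsilon \|x\|_2^2, \]

where the last inequality is by Cauchy-Schwarz. ■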
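
The bound in Proposition 13 can be checked numerically on small examples. The sketch below is only illustrative: it draws random unit-norm columns with entries $\pm 1/\sqrt{n}$ (an arbitrary choice made here, not a construction from the text), computes their coherence, and evaluates the deviation $|\|M'x\|^2 - \|x\|^2|$ for a given support and coefficient vector. All function names are hypothetical.

```python
import itertools
import random

def inner(u, v):
    """Inner product <u, v> = sum_i u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def coherence(columns):
    """Largest |<u, v>| over distinct pairs of columns."""
    return max(abs(inner(u, v)) for u, v in itertools.combinations(columns, 2))

def rip_deviation(columns, support, x):
    """Return | ||M'x||^2 - ||x||^2 | for the sub-matrix M' on `support`."""
    n = len(columns[0])
    y = [sum(columns[j][r] * xj for j, xj in zip(support, x)) for r in range(n)]
    return abs(sum(abs(t) ** 2 for t in y) - sum(abs(t) ** 2 for t in x))

random.seed(0)
n, N, L = 64, 10, 3
columns = [[random.choice((-1.0, 1.0)) / n ** 0.5 for _ in range(n)]
           for _ in range(N)]
eps = coherence(columns)
```

For any choice of L columns and any coefficient vector x, the deviation returned by `rip_deviation` should not exceed `L * eps * ||x||^2`, which is exactly what the proof above establishes.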
The above proposition can be combined with Proposition 8
and Corollary 12 to show the following result.

Corollary 14. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a balanced code with relative
minimum distance at least $1 - (1 + \epsilon)/q$. Then, Sph($\mathcal{C}/\mathbf{1}$)
satisfies RIP-2 of order L with constant $2L\epsilon$.

As for the Boolean embedding Bool(·), the following
observation is easy to verify:

Proposition 15. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a code with relative minimum
distance at least $\delta$. Then, the columns of Bool($\mathcal{C}$)$/\sqrt{n}$ form a
$(1 - \delta)$-coherent spherical code.

Combined with Proposition 13, we see that the Boolean
embedding can also result in RIP matrices.

Corollary 16. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a code with relative minimum
distance at least $1 - (1 + \epsilon)/q$. Then, Bool($\mathcal{C}$)$/\sqrt{n}$ satisfies
RIP-2 of order L with constant $(1 + \epsilon)L/q$.

Now we consider instantiations of the above result with
asymptotically good families of codes. Various positive and
negative bounds are known for rate-distance trade-offs
achievable by error-correcting codes. On the positive side, the
Gilbert-Varshamov bound on codes [15], [16] states that for
every alphabet size q > 1 and constant $\delta \in [0, 1 - 1/q)$, there
are q-ary codes with rate

\[ R \ge 1 - h_q(\delta) - o(1), \tag{2} \]

where $h_q(\cdot)$ is the q-ary entropy function defined as

\[ h_q(\delta) := \delta \log_q(q - 1) - \delta \log_q(\delta) - (1 - \delta) \log_q(1 - \delta). \]

This bound is achieved by a random linear code (assuming
a prime power alphabet size) with overwhelming probability,
and one can also make sure that the code is balanced, by
forcing the all-ones word to be in the code. When
$\delta = 1 - (1 + \epsilon)/q$, the bound $1 - h_q(\delta)$ becomes

\[ 1 - h_q(\delta) = \Omega(\epsilon^2 / (q \log q)). \tag{3} \]
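
To get a feel for the order of magnitude of the Gilbert-Varshamov rate near $\delta = 1 - 1/q$, the sketch below evaluates $1 - h_q(\delta)$ at $\delta = 1 - (1 + \epsilon)/q$ and compares it with the quantity $\epsilon^2 / (2 (q - 1) \ln q)$. That second expression is a second-order Taylor estimate worked out for this illustration; it is an assumption of the sketch and not a formula stated in the text, which only uses the order $\epsilon^2/(q \log q)$.

```python
import math

def h_q(delta, q):
    """q-ary entropy function for 0 <= delta < 1."""
    if delta == 0:
        return 0.0
    return (delta * math.log(q - 1, q)
            - delta * math.log(delta, q)
            - (1 - delta) * math.log(1 - delta, q))

def gv_rate(eps, q):
    """Value of 1 - h_q(delta) at delta = 1 - (1 + eps)/q."""
    delta = 1 - (1 + eps) / q
    return 1 - h_q(delta, q)

def second_order_estimate(eps, q):
    """Taylor estimate eps^2 / (2 (q - 1) ln q); an assumption of this sketch."""
    return eps ** 2 / (2 * (q - 1) * math.log(q))

for q in (2, 4):
    print(q, gv_rate(0.1, q), second_order_estimate(0.1, q))
```

For small $\epsilon$ the two printed values agree to within a few percent, consistent with a rate that scales like $\epsilon^2/(q \log q)$.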

Now let us instantiate the above results with a balanced
q-ary code $\mathcal{C} \subseteq \mathbb{Z}_q^n$ on the Gilbert-Varshamov bound and with
relative minimum distance $1 - (1 + \epsilon)/q$. First, consider the
spherical embedding Sph($\mathcal{C}/\mathbf{1}$) and suppose that we wish to
obtain an $n \times N$ RIP-2 matrix of order L with a fixed constant
$\alpha$. In order to apply Corollary 14 we need to set $\epsilon = \alpha/(2L)$.
In this case, the Gilbert-Varshamov bound implies that the rate
R of $\mathcal{C}$ can be at least $\Omega(\epsilon^2/(q \log q)) = \Omega(\alpha^2/(L^2 q \log q))$.
The number of columns of the resulting matrix is $N = q^{Rn - 1}$.
Therefore, we have

\[ \log N = (Rn - 1) \log q = \Omega(\alpha^2 n / (L^2 q)), \]

or in other words,

\[ n = O(L^2 (\log N) q / \alpha^2) = O_{\alpha, q}(L^2 \log N). \tag{4} \]

We remark that Porat and Rothschild [17] show how to
derandomize the probabilistic construction of linear codes on the
Gilbert-Varshamov bound for any fixed prime power alphabet
q. They design a deterministic algorithm for constructing the
generator matrix of the code in time $O(n q^{Rn})$, where R is the
rate.² This running time is nearly linear in the number of
entries of the resulting RIP matrix.

It is well known that there are RIP-2 matrices of order L
with $n = O(L \log(N/L))$ rows and this bound is achieved by
several probabilistic constructions (in particular, independent
Bernoulli $\pm 1/\sqrt{n}$ entries) [18], [19]. However, we see that
even using codes on the Gilbert-Varshamov bound, the number
of rows of the RIP matrix obtained from Corollary 14 becomes
larger by a multiplicative factor of about $\Omega(L)$. To see whether
this can be improved, we consider negative bounds on the
rate-distance trade-offs of codes.

For our range of parameters, the best known negative
bounds on the rate-distance trade-off of error-correcting codes (that
show upper bounds on the rate of any code with a certain
minimum distance) are given by linear-programming techniques.
In particular, the linear programming bound due to McEliece,
Rodemich, Rumsey, and Welch (cf. [20, Chapter 5]) states
that, asymptotically, any binary code with relative minimum
distance at least $\delta$ and rate R must satisfy

\[ R \le h(1/2 - \sqrt{\delta(1 - \delta)}) + o(1). \]

This bound can be generalized to q-ary codes as follows (see
[21]):

\[ R \le h_q\Big( \frac{1}{q} \big( q - 1 - (q - 2)\delta - 2\sqrt{(q - 1)\delta(1 - \delta)} \big) \Big) + o(1). \tag{5} \]

For any fixed q, and for $\delta = 1 - (1 + \epsilon)/q$, this bound simplifies
to $R = O(\epsilon^2 \log(1/\epsilon))$. Using simple calculations as before,
we conclude that the RIP-2 matrix construction of Corollary 14
always requires $n = \Omega(L^2 (\log N) / \log L)$ rows, regardless of
the code being used.

The RIP matrices constructed from Corollary 14 require a
factor $\Omega(L^2)$ in the number of rows due to the fact that their
column set forms a spherical code. It is known that any
$\epsilon$-coherent spherical code of size N over $\mathbb{C}^n$ must satisfy the
following (cf. [22]):

\[ \epsilon^2 = \Omega\Big( \frac{\log N}{n \log(n / \log N)} \Big), \tag{6} \]

which implies $n = \Omega((\log N) / (\epsilon^2 \log(1/\epsilon)))$. Therefore, the
factor $\epsilon^2$ in the denominator of the bound on n (which
translates to a factor $L^2$ in the RIP setting) is necessary.

² The algorithm can be adapted to ensure that the obtained code is balanced.

On the positive side, the reduction above from the codes
on the Gilbert-Varshamov bound indirectly shows that
spherical codes with coherence $\epsilon = O(\sqrt{(\log N)/n})$ (i.e.,
$n = O(\epsilon^{-2} \log N)$) exist and can be attained using probabilistic
constructions. On the negative side, the lower bound (6) can
be translated (using the reduction from error-correcting codes
to spherical codes) to upper bounds on the attainable rates
of q-ary codes with distance close to $1 - 1/q$. This results
in an indirect upper bound comparable to what the linear
programming bound (5) implies.

Now we turn to the construction of RIP matrices from
the Boolean embedding of error-correcting codes obtained in
Corollary 16. In order to obtain an RIP-2 matrix of order L
with constant $\alpha$, by Corollary 16 it suffices to have a code
$\mathcal{C} \subseteq \mathbb{Z}_q^n$ attaining the Gilbert-Varshamov bound with relative
minimum distance at least $1 - (1 + \epsilon)/q$ and $\epsilon \le (\alpha q/L) - 1$.
For a fixed constant $\alpha$, we can set q = O(L) large enough
(e.g., $q = 2L/\alpha$) and choose $\epsilon$ to be a small absolute constant
(e.g., $\epsilon = .01$) so that the above condition is satisfied. The
resulting matrix would have $N := |\mathcal{C}|$ columns and $n' := nq$
rows, with entries that are either 0 or $1/\sqrt{n}$. Moreover, the
matrix is rather sparse in that all but a 1/q fraction of the
entries are zeros.

Now, the Gilbert-Varshamov bound (2) implies that the rate
R of $\mathcal{C}$ can be made at least $\Omega(\epsilon^2/(q \log q)) = \Omega(1/(q \log q))$.
Thus we have

\[ \log N = \log |\mathcal{C}| = (Rn'/q) \log q = \Omega(n'/q^2), \]

which gives $n' = O(q^2 \log N) = O(L^2 \log N)$. This is
comparable to the bound (4) that we obtained from the spherical
embedding of codes. Similar to the case of spherical codes,
the Boolean embedding allows us to translate positive bounds
on the rate-distance trade-off of codes (e.g., the
Gilbert-Varshamov bound) to upper bounds on the coherence of
spherical codes as well as upper bounds on the number of rows
of RIP-2 matrices. Conversely, through the Boolean embedding,
lower bounds on the coherence of spherical codes and lower
bounds on the number of rows of RIP-2 matrices translate into
impossibility bounds on the rate-distance trade-off of
error-correcting codes, the former being comparable to the linear
programming bound (5) when the relative minimum distance
is around $1 - 1/q$, but the latter is much weaker (namely,
comparable to the Plotkin bound on codes [20, Chapter 5],
which is, over small alphabets, much weaker than the linear
programming bounds).

IV. From Average Distance to RIP 

As we saw in the previous section, the quadratic dependence 
on sparsity L is unavoidable when the column set of an RIP 
matrix forms a low-coherence spherical code. In this section 
we introduce the notion of L-wise distance that turns out to 
be more closely related to the RIP. 

Definition 17. Let $c_1, \ldots, c_L \in \mathbb{Z}_q^n$ be L vectors. The average
distance of $c_1, \ldots, c_L$ is defined in the natural way as

\[ \mathrm{dist}_L(c_1, \ldots, c_L) := \frac{1}{n \binom{L}{2}} \sum_{1 \le i < j \le L} \Delta(c_i, c_j), \]

where $\Delta(c_i, c_j)$ is the Hamming distance between $c_i$ and $c_j$.

Definition 18. Let $\mathcal{C} \subseteq \mathbb{Z}_q^n$ be a code, and L be an integer
where $1 < L \le |\mathcal{C}|$. Define the L-wise distance of $\mathcal{C}$ as

\[ \mathrm{dist}_L(\mathcal{C}) := \min_{\{c_1, \ldots, c_L\} \subseteq \mathcal{C}} \mathrm{dist}_L(c_1, \ldots, c_L). \]

The special case L = 2 is equal to the minimum relative
distance of the code. For the other extreme case, $L = |\mathcal{C}|$, the
L-wise distance of the code is the average relative distance
over all codeword pairs. For linear codes, this quantity is the
expected relative weight of a random codeword, given by

\[ \mathrm{dist}_{|\mathcal{C}|}(\mathcal{C}) = \frac{(q - 1) \, |\{ i \in [n] : (\exists (c_1, \ldots, c_n) \in \mathcal{C}),\ c_i \ne 0 \}|}{qn}. \]

Thus, as long as the code is non-constant at all positions, its
$|\mathcal{C}|$-wise distance is equal to $(1 - 1/q)$. Also, a simple exercise
shows the "monotonicity property" that for any code $\mathcal{C}$, and
$L' \ge L$, $\mathrm{dist}_{L'}(\mathcal{C}) \ge \mathrm{dist}_L(\mathcal{C})$.
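
For small codes, the L-wise distance can be computed by exhaustive search. The sketch below assumes the normalization in which the pairwise Hamming distances are averaged over all pairs and divided by the block length n, so that L = 2 gives the relative minimum distance; the function names are illustrative.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def avg_distance(words):
    """Average relative Hamming distance over all pairs of the given words."""
    n = len(words[0])
    pairs = list(combinations(words, 2))
    return sum(hamming(u, v) for u, v in pairs) / (n * len(pairs))

def l_wise_distance(code, L):
    """Minimum average distance over all L-subsets of the code (brute force)."""
    return min(avg_distance(subset) for subset in combinations(code, L))

code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)]
d2 = l_wise_distance(code, 2)   # relative minimum distance
d3 = l_wise_distance(code, 3)
```

In this example the closest pair of codewords differs in 2 of 4 positions, while every triple has total pairwise distance 8 over 3 pairs, so the 3-wise distance is larger than the 2-wise distance, in line with the monotonicity property.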

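We will use the notion of flat RIP below from [23].

Definition 19. Let $\alpha > 0$ be a real parameter. An $n \times N$
matrix $M \in \mathbb{C}^{n \times N}$ with columns $M_1, \ldots, M_N \in \mathbb{C}^n$ is said to
satisfy flat RIP of order L with constant $\alpha$ if for all $i \in [N]$,
$\|M_i\| = 1$ and moreover, for any disjoint $\mathcal{L}_1, \mathcal{L}_2 \subseteq [N]$ with
$|\mathcal{L}_1| = |\mathcal{L}_2| \le L$ we have

\[ \Big| \Big\langle \sum_{i \in \mathcal{L}_1} M_i, \sum_{j \in \mathcal{L}_2} M_j \Big\rangle \Big| \le \alpha \sqrt{|\mathcal{L}_1| |\mathcal{L}_2|} = \alpha |\mathcal{L}_1|. \]

The original definition of flat RIP in [23] is stronger and
does not assume that the two sets $\mathcal{L}_1$ and $\mathcal{L}_2$ have equal sizes.
However, adding the extra constraint does not affect the result
that we use from their work (Lemma 21 below).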



A straightforward exercise shows that the standard RIP-2 is 
no weaker than the flat RIP, namely, 

Proposition 20. Suppose a matrix M satisfies RIP-2 of order 
2L with constant a. Then, M satisfies flat RIP of order L with 
constant O(a). 

More interestingly, the two notions turn out to be essentially 
equivalent (up to a logarithmic loss in the RIP constant) in 
light of the following result by Bourgain et al.: 

Lemma 21 ([23]). Let $L \ge 2^{10}$ and suppose that a matrix M
satisfies flat RIP of order L with constant $\alpha$. Then M satisfies
RIP-2 of order 2L with constant $44 \alpha \log L$.

The notion of L-wise distance is a relaxed variation of the 
minimum distance, where the distance is averaged over various 
choices of L distinct codewords, as opposed to only two. 
Similarly, the notions of e-biased codes and spherical codes 
can be relaxed to L-wise forms and one can obtain various 
generalizations of the results presented in Section III to codes
satisfying the relaxed notion of L-wise distance. 

For clarity of presentation, for the remainder of this section 
we only focus on binary codes. In this case, if the code C 
with L-wise distance at least 1/2 — e contains the all-ones 
word, one can simply show that not only the average distance 
of any choice of L codewords in C/l is at least 1/2 — e, 
but this quantity is also no more than 1/2 + e (to see this, it 
suffices to note that the average distance of L codewords plus 
the average distance of their negations equals one). Let us call 
codes satisfying this stronger property L-wise e-biased: 

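Definition 22. Let $\mathcal{C} \subseteq \mathbb{Z}_2^n$ be a code, and L be an integer
where $1 < L \le |\mathcal{C}|$. Then, $\mathcal{C}$ is called L-wise $\epsilon$-biased if

\[ \max_{\{c_1, \ldots, c_L\} \subseteq \mathcal{C}} |\mathrm{dist}_L(c_1, \ldots, c_L) - 1/2| \le \epsilon. \]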



The result below shows how the flat RIP and L-wise
distance are related. Again, the result is only presented for binary
codes and the extension to q-ary codes is straightforward.

Lemma 23. Suppose $\mathcal{C} \subseteq \mathbb{F}_2^n$ is such that, for a positive
integer $L_0$ and all $L \le 2L_0$, $\mathcal{C}$ is L-wise $(\alpha/L)$-biased. Then,
Sph($\mathcal{C}$) satisfies flat RIP of order $L_0$ with constant $4\alpha$.

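Proof: Fix any $L' \le 2L_0$ and any collection $c_1, \ldots, c_{L'}$
of the codewords in $\mathcal{C}$. Define

\[
\begin{aligned}
\eta(c_1, \ldots, c_{L'}) &:= \sum_{1 \le i < j \le L'} \langle \mathrm{Sph}(c_i), \mathrm{Sph}(c_j) \rangle \\
&= \sum_{1 \le i < j \le L'} (1 - 2\Delta(c_i, c_j)/n) \\
&\in [-\alpha L', +\alpha L'], \qquad (8)
\end{aligned}
\]

where (8) is due to the small-bias assumption on $\mathcal{C}$: the sum equals
$\binom{L'}{2} (1 - 2\,\mathrm{dist}_{L'}(c_1, \ldots, c_{L'}))$, and $|1 - 2\,\mathrm{dist}_{L'}| \le 2\alpha/L'$.

Now, let $L \le L_0$ and $M_1, \ldots, M_{2L}$ be distinct columns of
Sph($\mathcal{C}$) corresponding to distinct codewords $c'_1, \ldots, c'_{2L}$ in
$\mathcal{C}$. Now, from Definition 19 we need to bound the quantity

\[
\begin{aligned}
\eta' &:= \Big\langle \sum_{i \in [L]} M_i, \sum_{L < i \le 2L} M_i \Big\rangle \\
&= \sum_{1 \le i < j \le 2L} \langle M_i, M_j \rangle - \sum_{1 \le i < j \le L} \langle M_i, M_j \rangle - \sum_{L + 1 \le i < j \le 2L} \langle M_i, M_j \rangle.
\end{aligned}
\]

Now, the absolute value of $\eta'$ can be bounded as

\[ |\eta'| \le |\eta(c'_1, \ldots, c'_{2L})| + |\eta(c'_1, \ldots, c'_L)| + |\eta(c'_{L+1}, \ldots, c'_{2L})| \overset{(\star)}{\le} 4\alpha L, \]

where (★) is from (8). ■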
Note that, contrary to Corollary 14, the above result does
not require the code to have an extremal minimum distance.
In principle, $\mathcal{C}$ can have a minimum distance bounded away
from 1/2 by a constant (depending on the constant $\alpha$) and still
satisfy the conditions of Lemma 23.

The above result is also valid in the reverse direction, as
follows.

Lemma 24. Let M be an $n \times N$ matrix with entries in
$\{-1/\sqrt{n}, +1/\sqrt{n}\}$ satisfying the flat RIP of order $L_0$ with
constant $\alpha$. Then, the columns of M form the spherical encoding
of a code $\mathcal{C} \subseteq \mathbb{F}_2^n$ such that for any $L \le L_0$, the code $\mathcal{C}$ is
L-wise $O(\alpha/L)$-biased.

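Proof: Assume L is even (the odd case is similar).
Consider any L distinct columns $M_1, \ldots, M_L$ of M and
observe that

\[ \eta := \sum_{1 \le i < j \le L} \langle M_i, M_j \rangle = \sum_{\substack{\mathcal{L} \subseteq [L] \\ |\mathcal{L}| = L/2}} \sum_{i \in \mathcal{L}} \sum_{j \in [L] \setminus \mathcal{L}} \langle M_i, M_j \rangle \Big/ \Big( 2 \binom{L - 2}{L/2 - 1} \Big), \]

since each pair of distinct indices is separated by exactly
$2 \binom{L - 2}{L/2 - 1}$ of the sets $\mathcal{L}$. By the flat RIP, each term
$\sum_{i \in \mathcal{L}} \sum_{j \in [L] \setminus \mathcal{L}} \langle M_i, M_j \rangle$ is upper
bounded in absolute value by $\alpha L/2$, and therefore, the above
equation simplifies in absolute value to $|\eta| = O(\alpha L)$. Now
suppose the codewords corresponding to $M_1, \ldots, M_L$ are
$c_1, \ldots, c_L$. The L-wise distance of these codewords can be
written as

\[ \mathrm{dist}_L(c_1, \ldots, c_L) = \frac{1}{\binom{L}{2}} \sum_{1 \le i < j \le L} \frac{1 - \langle M_i, M_j \rangle}{2} = \frac{1}{2} - \frac{\eta}{2 \binom{L}{2}}. \]

Hence, $|\mathrm{dist}_L(c_1, \ldots, c_L) - 1/2| = |\eta| / (2 \binom{L}{2}) = O(\alpha/L)$. ■
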
V. Designs and Disjunct Matrices 

In this section we turn to the problem of combinatorial
group testing, and in particular discuss coding-theoretic
constructions of disjunct matrices. One of the foremost
constructions dates back to the work of Kautz and Singleton [24],
who used Reed-Solomon codes for the purpose of constructing
disjunct matrices.³ This work results in a general framework
for construction of disjunct matrices through combinatorial
designs, which are defined as follows.

Definition 25. An (n, n', r)-design is a set system
$S_1, \ldots, S_N \subseteq [n]$ such that the size of each set is n'
and for every pair $i, j \in [N]$ ($i \ne j$) we have $|S_i \cap S_j| \le r$.

The following simple observations show that designs can be 
used to construct disjunct matrices, and can in turn be obtained 
from error-correcting codes: 

Lemma 26. Let $\mathcal{D} = \{S_1, \ldots, S_N\}$ be an (n, n', r)-design,
and consider the binary $n \times N$ matrix M induced by $\mathcal{D}$ where
the ith column of M is supported on $S_i$. Then, M is L-disjunct
provided that $Lr < n'$.

Proof: It suffices to observe that in Definition 2 each of
the $M_i$ for $i \in [L]$ contains at most r of the n' elements on
$\mathrm{supp}(M_0)$. ■

Lemma 27. Let $\mathcal{C} = \{c_1, \ldots, c_N\} \subseteq \mathbb{Z}_q^{n'}$ be a code with
minimum Hamming distance at least d. For $n := n'q$, consider
the set system $\mathcal{D} := \{S_i : i \in [N]\}$ defined from the Boolean
embedding of $\mathcal{C}$ as follows: $S_i := \mathrm{supp}(\mathrm{Bool}(c_i))$. Then, $\mathcal{D}$ is
an $(n, n', n' - d)$-design.

Proof: Observe that the intersection size $|S_i \cap S_j|$, for $i \ne j$,
is equal to $n' - \Delta(c_i, c_j) \le n' - d$. The rest of the conditions
are trivial. ■

Now let us instantiate the above lemmas with a
k-dimensional Reed-Solomon code, as in [24]. In this case,
the alphabet size q can be made equal to the block length
n' (assuming that n' is a prime power). From Lemma 27,
the resulting (n, n', r)-design satisfies $n = n'^2$,
$r = n' - (n' - k) = k$ (since the minimum distance of the code is
$n' - k + 1$), and $\log N = k \log q = r \log n' \ge r$. Furthermore,
by Lemma 26, characteristic vectors of the resulting design
form a disjunct matrix with sparsity parameter $L \approx n'/r$.
Therefore, the number of rows n can be upper bounded as
$n = n'^2 \approx (rL)^2 \le (L \log N)^2$.
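
The chain code, design, disjunct matrix can be traced on a toy instance. The sketch below is a simplification under stated assumptions: it uses a prime alphabet $\mathbb{Z}_p$ (rather than a general prime power), evaluates all polynomials of degree less than k at every point of $\mathbb{Z}_p$, and checks disjunctness by brute force. The function names are illustrative. Note that the code measures the actual maximum intersection size, which is k - 1 here and hence within the bound r = k used in the text.

```python
from itertools import combinations, product

def reed_solomon(p, k):
    """All codewords of the length-p, dimension-k Reed-Solomon code over Z_p."""
    return [tuple(sum(a * pow(x, e, p) for e, a in enumerate(coeffs)) % p
                  for x in range(p))
            for coeffs in product(range(p), repeat=k)]

def design_from_code(code, q):
    """Supports of the Boolean embeddings: symbol s at position i -> row i*q + s."""
    return [frozenset(i * q + s for i, s in enumerate(c)) for c in code]

def is_disjunct(sets, L):
    """Brute force: no set may be covered by the union of L other sets."""
    for j0, s0 in enumerate(sets):
        others = sets[:j0] + sets[j0 + 1:]
        for group in combinations(others, L):
            if s0 <= frozenset().union(*group):
                return False
    return True

p, k = 3, 2
code = reed_solomon(p, k)            # N = p**k = 9 codewords of length n' = 3
design = design_from_code(code, p)   # subsets of a ground set of size n = 9
max_overlap = max(len(a & b) for a, b in combinations(design, 2))
```

With these parameters every set has size 3 and two distinct sets share at most one element, so Lemma 26 guarantees 2-disjunctness; three constant polynomials can cover any non-constant line, so the design is not 3-disjunct.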

As a second example, consider choosing a q-ary code on the
Gilbert-Varshamov bound with minimum Hamming distance
at least $d := n' - (1 + \epsilon)n'/q$, for some small (and fixed)
constant $\epsilon > 0$. Recall that the rate R of the code satisfies
$R = \Omega(\epsilon^2/(q \log q))$. This time, we obtain an (n, n', r)-design with
$r = n' - d = (1 + \epsilon)n'/q$, $n = n'q = (1 + \epsilon)n'^2/r = O(n'^2/r)$
and $\log N = Rn' \log q = \Omega(\epsilon^2 n'/q) = \Omega(r\epsilon^2/(1 + \epsilon)) = \Omega(r)$.
Now Lemma 26 implies that the measurement matrix that has
the Boolean embedding of the codewords as its columns is
L-disjunct for $L \approx n'/r$. Note that since $q = (1 + \epsilon)n'/r$, we
must choose $q = \Omega(L)$ for the bounds to follow. Altogether,
we obtain $n = n'q = O(n'^2/r) = O(L^2 r) = O(L^2 \log N)$.

Probabilistic arguments can be used to show that (n, n', r)-designs
of size N exist for $n = O(n'^2 N^{1/r}/r)$, and moreover,
this bound is known to be nearly tight (cf. [25] and [7, Ch. 7]).
Therefore, we see that the design obtained from codes on
the Gilbert-Varshamov bound, for which $nr/n'^2 = O(1)$ and
$\log N = \Omega(r)$, essentially achieves the best possible bounds.

³ The work of Kautz and Singleton aims to construct superimposed codes,
which are closely related to disjunct matrices.

Regarding the existence of disjunct matrices, it is known
that L-disjunct matrices exist with $n = O(L^2 \log N)$ rows
(using the probabilistic method) and moreover, any L-disjunct
matrix must satisfy $n = \Omega(L^2 \log_L N)$ (cf. [7, Ch. 7]). Again,
we see that the disjunct matrices obtained from codes on the
Gilbert-Varshamov bound are essentially optimal. Moreover,
such matrices can be generated in polynomial time in the size
of the matrix using the result of Porat and Rothschild [17].

VI. List Decoding and Sparse Recovery 

As we saw in Section IV, the relaxed notion of L-wise
distance essentially captures the RIP-2 for matrices with
$\pm 1/\sqrt{n}$ entries. In this section, we relate this notion to the
standard notion of combinatorial list-decoding that has been
extensively studied in the coding-theory literature.

We remark that the notion of soft-decision list-decodable
codes has been used for construction of RIP-1 matrices, and it
is known that optimal RIP-1 matrices can be constructed from
optimal soft-decision list-decodable codes which, in particular,
imply optimal unbalanced lossless expander graphs (see [26],
[27], [11] and the references therein for the construction
of RIP-1 matrices from expander graphs and [28] for the
reduction from codes to expander graphs). The goal in this
section is to show how list-decoding is related to the more
geometric property RIP-2.

Definition 28. A code $C \subseteq \mathbb{Z}_q^n$ is $(L, \rho)$-list decodable if for
any $x \in \mathbb{Z}_q^n$, we have $|B(x, \rho) \cap C| < L$, where $B(x, \rho)$ denotes
the Hamming ball of radius $\rho n$ around $x$.
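
For very small codes, the definition can be checked directly by enumerating all centers. The Python sketch below is such a brute-force check, written under two assumptions that are ours: the Hamming ball is taken to be closed (distance at most $\rho n$), and the inequality is strict as stated in Definition 28. The function names are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of coordinates where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def max_list_size(code, q, rho):
    """Largest number of codewords inside any closed Hamming ball of radius rho*n."""
    n = len(code[0])
    return max(
        sum(hamming(x, c) <= rho * n for c in code)
        for x in product(range(q), repeat=n)
    )


def is_list_decodable(code, q, L, rho):
    """Definition 28 with the strict inequality |B(x, rho) ∩ C| < L."""
    return max_list_size(code, q, rho) < L


# Toy example: the binary even-weight code of length 4.
even_weight = [w for w in product(range(2), repeat=4) if sum(w) % 2 == 0]
print(max_list_size(even_weight, 2, 0.25))
print(is_list_decodable(even_weight, 2, 5, 0.25))
```

The enumeration takes time $q^n |C|$, so it is only meant to make the quantifiers in the definition concrete.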

In the following lemma, we show that codes with good $L$-wise
distance have good list-decoding properties.

Lemma 29. Suppose that the $L$-wise distance of a code $C \subseteq \mathbb{Z}_2^n$
is at least $1/2 - \epsilon^2$, where $L = O(1/\epsilon^2)$. Then, $C$ is
$(O(1/\epsilon^2), 1/2 - \epsilon)$-list decodable.

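Proof: The proof idea is inspired by a geometric proof of
the Johnson bound due to Guruswami and Sudan [29]. By
the end of the proof, we will determine an $L' = O(1/\epsilon^2)$
satisfying $L' \ge L$ such that the assumption that $C$ is not
$(L', 1/2 - \epsilon)$-list decodable leads to a contradiction.

Now, for the sake of contradiction, consider any $x \in \mathbb{Z}_2^n$
for which $C \cap B(x, 1/2 - \epsilon)$ has size at least $L'$. Take any set
of distinct codewords

$$c_1, \ldots, c_{L'} \in C \cap B(x, 1/2 - \epsilon)$$

and consider the spherical encodings $v_0 := \mathrm{Sph}(x)$, $v_1 :=
\mathrm{Sph}(c_1), \ldots, v_{L'} := \mathrm{Sph}(c_{L'})$. By the monotonicity property
of the $L$-wise distance, we know that $\mathrm{dist}_{L'}(c_1, \ldots, c_{L'}) \ge
1/2 - \epsilon^2$. For spherical embeddings, this translates to

$$\sum_{1 \le i < j \le L'} \langle v_i, v_j \rangle = \binom{L'}{2} \bigl(1 - 2\,\mathrm{dist}_{L'}(c_1, \ldots, c_{L'})\bigr) \le 2 \binom{L'}{2} \epsilon^2 \le L'^2 \epsilon^2. \qquad (9)$$

Also, since the relative Hamming distance between $x$ and any
$c_i$ is at most $1/2 - \epsilon$, we get

$$(\forall i \in [L']) \quad \langle v_i, v_0 \rangle = 1 - 2\Delta(c_i, x)/n \ge 2\epsilon. \qquad (10)$$

Using (10), for every $i \in [L']$ and parameter $\beta > 0$,

$$\langle v_i - \beta v_0, v_i - \beta v_0 \rangle = 1 + \beta^2 - 2\beta \langle v_i, v_0 \rangle \le 1 + \beta^2 - 4\epsilon\beta. \qquad (11)$$

Similarly, for $1 \le i < j \le L'$ we can write

$$\langle v_i - \beta v_0, v_j - \beta v_0 \rangle = \langle v_i, v_j \rangle + \beta^2 - \beta \langle v_i + v_j, v_0 \rangle \le \langle v_i, v_j \rangle + \beta^2 - 4\epsilon\beta. \qquad (12)$$

Altogether,

$$\begin{aligned}
0 &\le \Bigl\langle \sum_{i \in [L']} (v_i - \beta v_0), \sum_{i \in [L']} (v_i - \beta v_0) \Bigr\rangle \\
&= \sum_{i \in [L']} \langle v_i - \beta v_0, v_i - \beta v_0 \rangle + 2 \sum_{1 \le i < j \le L'} \langle v_i - \beta v_0, v_j - \beta v_0 \rangle \\
&\le L'(1 + \beta^2 - 4\epsilon\beta) + 2L'^2\epsilon^2 + (L'^2 - L')(\beta^2 - 4\epsilon\beta),
\end{aligned}$$

where the last inequality uses (9), (11), and (12). Therefore,
after reordering, we have $L' \le 1/(4\epsilon\beta - \beta^2 - 2\epsilon^2)$,
provided that the denominator is positive. Now we choose
$\beta := \epsilon$ to get $L' \le 1/\epsilon^2$. Therefore, it suffices to choose
$L' > \max\{1/\epsilon^2, L\}$ to get the desired contradiction. ■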
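
The inequality chain in the proof can be checked numerically on a toy instance. The Python sketch below is only an illustration under assumptions of ours: the spherical encoding is taken to map bit 0 to $+1/\sqrt{n}$ and bit 1 to $-1/\sqrt{n}$, the code is the length-8 Hadamard code, the center is a hand-picked vector, and the helper names are hypothetical. It evaluates both sides of the final inequality with $\beta = \epsilon$ for the codewords inside the ball, after checking directly that those codewords have average pairwise distance at least $1/2 - \epsilon^2$.

```python
from itertools import combinations, product
from math import sqrt


def sph(word):
    """Assumed spherical encoding: bit 0 -> +1/sqrt(n), bit 1 -> -1/sqrt(n)."""
    n = len(word)
    return [(1.0 if b == 0 else -1.0) / sqrt(n) for b in word]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def rel_dist(a, b):
    return sum(x != y for x, y in zip(a, b)) / len(a)


def avg_pairwise_distance(words):
    pairs = list(combinations(words, 2))
    return sum(rel_dist(a, b) for a, b in pairs) / len(pairs)


def proof_sides(x, words, eps, beta):
    """Return (||sum_i (v_i - beta*v_0)||^2, upper bound from (9), (11), (12))."""
    v0 = sph(x)
    total = [0.0] * len(x)
    for c in words:
        total = [t + vi - beta * w for t, vi, w in zip(total, sph(c), v0)]
    m = len(words)  # plays the role of L'
    bound = (m * (1 + beta**2 - 4 * eps * beta)
             + 2 * m**2 * eps**2
             + (m**2 - m) * (beta**2 - 4 * eps * beta))
    return inner(total, total), bound


# Toy instance: the length-8 Hadamard code and the center x(p) = p1 * p2.
points = list(product(range(2), repeat=3))
hadamard = [tuple(sum(a * p for a, p in zip(coef, pt)) % 2 for pt in points)
            for coef in product(range(2), repeat=3)]
x = tuple(pt[0] * pt[1] for pt in points)
eps = 0.25
near = [c for c in hadamard if rel_dist(x, c) <= 0.5 - eps]
lhs, rhs = proof_sides(x, near, eps, beta=eps)
print(len(near), avg_pairwise_distance(near), lhs, rhs)
```

In this instance three codewords lie in the ball, which is consistent with the bound $L' \le 1/\epsilon^2 = 16$ obtained in the proof.
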
A sequence of results that we have seen so far can be
combined to obtain list-decodable codes from RIP matrices.
Namely, starting from a binary RIP matrix, we can apply
Proposition 20, Lemma 24, and Lemma 29 in order and obtain
the following:

Lemma 30. Suppose an $n \times N$ matrix $M$ with entries
in $\{-1/\sqrt{n}, +1/\sqrt{n}\}$ satisfies the RIP-2 of order $L$ with
constant $\alpha$. Let $C \subseteq \mathbb{Z}_2^n$ be the binary code such that
$M = \mathrm{Sph}(C)$. Then, there is a parameter $\epsilon_0 = O(\sqrt{\alpha/L})$
such that for every $\epsilon \ge \epsilon_0$, $C$ is $(O(1/\epsilon^2), 1/2 - \epsilon)$-list
decodable.
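
To make the correspondence $M = \mathrm{Sph}(C)$ concrete, the Python sketch below builds the matrix from a toy binary code and computes its RIP-2 constant for order 2 only. Several points are assumptions of ours rather than statements from the text: the encoding convention (bit 0 to $+1/\sqrt{n}$, bit 1 to $-1/\sqrt{n}$), the squared-norm form of RIP-2, the restriction to order 2 (where, for unit-norm columns, the constant equals the largest absolute inner product between two columns), and the helper names. Lemma 30 itself concerns general order $L$, which this sketch does not compute.

```python
from itertools import combinations
from math import sqrt


def sph_matrix(code):
    """Rows of the n x N matrix whose columns are the assumed spherical
    encodings (bit 0 -> +1/sqrt(n), bit 1 -> -1/sqrt(n)) of the codewords."""
    n = len(code[0])
    return [[(1.0 if w[i] == 0 else -1.0) / sqrt(n) for w in code]
            for i in range(n)]


def column(M, j):
    return [row[j] for row in M]


def rip2_constant_order_2(M):
    """Smallest alpha with (1-alpha)|x|^2 <= |Mx|^2 <= (1+alpha)|x|^2 for all
    2-sparse x. For unit-norm columns each 2x2 Gram matrix has eigenvalues
    1 +/- g, so the constant is the largest |g| over pairs of columns."""
    N = len(M[0])
    return max(abs(sum(a * b for a, b in zip(column(M, i), column(M, j))))
               for i, j in combinations(range(N), 2))


def max_bias(code):
    """max |1 - 2*delta(c, c')| over distinct codewords (delta = relative distance)."""
    n = len(code[0])
    return max(abs(1 - 2 * sum(x != y for x, y in zip(a, b)) / n)
               for a, b in combinations(code, 2))


code = [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1)]
M = sph_matrix(code)
print(rip2_constant_order_2(M), max_bias(code))
```

The two printed quantities coincide because $\langle \mathrm{Sph}(c), \mathrm{Sph}(c') \rangle = 1 - 2\Delta(c, c')/n$, the same identity that is used in (10).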

Recall that the probabilistic method shows that RIP-2 matrices
of order $L$ exist with $N$ columns and $n = O(L \log(N/L))$
rows, and this is achieved with overwhelming probability by
a random matrix (with $\pm 1/\sqrt{n}$ entries). Using such a matrix
in Lemma 30, we obtain an $(O(1/\epsilon^2), 1/2 - \epsilon)$-list
decodable code with rate $R = \Omega(\epsilon^2)$. It can be directly shown
that this list-decoding trade-off is achieved by random codes
with overwhelming probability, and the trade-off is essentially
optimal (cf. [30]). However, the explicit constructions of optimal
RIP-2 matrices and of optimal binary list-decodable codes at
radius $1/2 - \epsilon$ are both challenging open problems. Therefore,
Lemma 30 relates two important explicit construction
problems; namely, it implies a reduction from Problem 31 to
Problem 32 below⁴ (when the latter problem is restricted to
binary real matrices).

Problem 31. Construct an explicit family of binary codes with
block length $n$ and rate $R = \Omega(\epsilon^2)$ that are $(O(1/\epsilon^2), 1/2 - \epsilon)$-list
decodable.

Problem 32. Construct an explicit family of RIP-2 matrices
of order $L$ with $N$ columns and $n = O(L \log(N/L))$ rows.

In Section III, we showed how to obtain explicit RIP-2
matrices from the spherical embedding of codes on the Gilbert-Varshamov
bound constructed by Porat and Rothschild [?].
This construction achieves $n = O(L^2 \log N)$, which is
the best known explicit bound for matrices with $\pm 1/\sqrt{n}$
entries⁵. Observe that the dependence on $L$ is sub-optimal by a
factor of two in the exponent. As for binary list-decodable codes
at radius close to 1/2 (and small list size), Guruswami et al.
construct explicit $(O(1/\epsilon^2), 1/2 - \epsilon)$-list decodable codes of
rate $R = \Omega(\epsilon^4)$ [30]. Again, the exponent of $\epsilon$ in the rate is
sub-optimal by a factor of two.

A natural question is whether the reduction offered by
Lemma 30 holds in the reverse direction as well; namely,

Question 33. Let $C \subseteq \mathbb{Z}_2^n$ be such that, for some integer $L$
and every $1 \le L' \le L$, the code $C$ is $(L', 1/2 - O(\sqrt{\alpha/L'}))$-list
decodable. Does $\mathrm{Sph}(C)$ satisfy RIP-2 of order $\Omega(L)$ with
constant $O(\alpha)$?

From Lemmas 23 and 21, we know that in order to answer
the above question in the affirmative, it suffices to show a converse
to Lemma 29. A weak converse, not strong enough for this
purpose, is shown below.

Lemma 34. Suppose that a code $C \subseteq \mathbb{Z}_2^n$ is $(L, 1/2 - \epsilon)$-list
decodable. Then, for $L' := L/\epsilon$, the $L'$-wise distance of the
code $C$ is at least $1/2 - 2\epsilon$.

Proof: The proof is, in essence, a straightforward averaging
argument. Suppose, for the sake of contradiction, that there
is a set of $L'$ codewords whose average pairwise distance is less than
$1/2 - 2\epsilon$. Denote the spherical encodings of these codewords
by $c_1, \ldots, c_{L'}$, each in $\{-1, +1\}^n$. From the definition of
$L'$-wise distance (Definition 10), we have

$$\sum_{1 \le i < j \le L'} \langle c_i, c_j \rangle > 4\epsilon \binom{L'}{2} n = 2\epsilon L'(L' - 1) n. \qquad (13)$$

Now, define $\bar v := \sum_{i=1}^{L'} c_i / L'$ and note that $\bar v$ is a real
vector in $[-1, +1]^n$. Suppose $\bar v = (\bar v_1, \ldots, \bar v_n)$ and randomly
pick a vector $v = (v_1, \ldots, v_n) \in \{-1, +1\}^n$ with independent
coordinates such that $\mathbb{E}[v_i] = \bar v_i$. This is possible since each
$\bar v_i$ is in $[-1, +1]$.

4 We remark that, for the reduction to yield explicit list-decodable codes, 
an explicit algorithm that computes the RIP matrix in polynomial time in the 
size of the matrix would not necessarily suffice. One needs the more stringent 
explicitness that requires each individual entry of the matrix to be computable 
in time poly(n). 

5 Bourgain et al. [23] explicitly obtain a better-than-quadratic dependence
on $L$ for an interesting range of parameters. However, the entries of their matrices
are powers of the primitive complex $p$th root of unity for a large prime $p$.



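Note that, by linearity of expectation, for every $i$ we have
$\mathbb{E}[\langle v, c_i \rangle] = \langle \bar v, c_i \rangle$, where $\bar v = \sum_{i=1}^{L'} c_i / L'$ is the
average vector and $v$ is its random rounding. Again, using linearity of expectation,

$$\begin{aligned}
\mathbb{E}\Bigl[\sum_{i=1}^{L'} \langle v, c_i \rangle\Bigr]
&= \sum_{i=1}^{L'} \langle \bar v, c_i \rangle = \Bigl\langle \bar v, \sum_{i=1}^{L'} c_i \Bigr\rangle
= \frac{1}{L'} \Bigl\langle \sum_{i=1}^{L'} c_i, \sum_{i=1}^{L'} c_i \Bigr\rangle \\
&= \frac{1}{L'} \Bigl( \sum_{i=1}^{L'} \langle c_i, c_i \rangle + 2 \sum_{1 \le i < j \le L'} \langle c_i, c_j \rangle \Bigr) \\
&> n + 4\epsilon (L' - 1) n \ge 4\epsilon L' n = 4Ln, \qquad (14)
\end{aligned}$$

where the strict inequality uses (13) and the last inequality uses
$\epsilon \le 1/4$ (for larger $\epsilon$ the lemma is trivial, since $1/2 - 2\epsilon < 0$).

Since some outcome of the random choice attains at least the
expected value, there is a deterministic choice
of $v \in \{-1, +1\}^n$ for which the sum is at least the expectation in (14). In the sequel, fix such a
$v$. We thus have

$$\sum_{i=1}^{L'} \langle v, c_i \rangle > 4Ln. \qquad (15)$$

Now, (15) implies that there must be a set $S$ of more than
$L$ vectors in $\{c_1, \ldots, c_{L'}\}$ such that for every $c \in S$, the
inequality $\langle v, c \rangle > 2\epsilon n$ holds, since if this were not the case,
we would have

$$\sum_{i=1}^{L'} \langle v, c_i \rangle \le L'(2\epsilon n) + Ln = 3Ln,$$

contradicting (15). We conclude that the codewords
corresponding to the spherical encodings in $S$ are all $(1/2 - \epsilon)$-close
in relative Hamming distance to the binary vector represented by
$v$. This contradicts the assumption that $C$ is $(L, 1/2 - \epsilon)$-list
decodable and completes the proof. ■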
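
The averaging step can be made concrete with a small Python sketch. The proof only needs the existence of a good $v$; as one possible explicit choice (ours, not the text's), the sketch rounds each coordinate of the average vector to its sign, which maximizes $\sum_i \langle v, c_i \rangle$ over all $\pm 1$ vectors and therefore attains at least the expectation. The toy cluster of words and the function names are hypothetical.

```python
from fractions import Fraction


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def average_vector(words):
    """The vector v-bar = (1/L') * sum_i c_i, with exact rational coordinates."""
    m = len(words)
    return [Fraction(sum(w[k] for w in words), m) for k in range(len(words[0]))]


def sign_rounding(vbar):
    """A +/-1 vector maximizing sum_i <v, c_i> = L' * <v, vbar> (ties go to +1)."""
    return [1 if t >= 0 else -1 for t in vbar]


def close_codewords(v, words, eps):
    """Words c with <v, c> > 2*eps*n, i.e. relative distance below 1/2 - eps."""
    n = len(v)
    return [c for c in words if inner(v, c) > 2 * eps * n]


# Toy cluster: five +/-1 words of length 8, each with a single -1 entry.
n, m = 8, 5
words = [tuple(-1 if k == i else 1 for k in range(n)) for i in range(m)]
vbar = average_vector(words)
v = sign_rounding(vbar)
expected = m * inner(vbar, vbar)   # E[sum_i <v, c_i>] under random rounding
achieved = sum(inner(v, c) for c in words)
eps = Fraction(1, 10)
print(expected, achieved, len(close_codewords(v, words, eps)))
```

Here the words have average pairwise relative distance $1/4 < 1/2 - 2\epsilon$, and all five of them end up close to the rounded center, as the argument predicts for a tightly clustered set.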

VII. Conclusion 

The reductions between coding-theoretic objects such as
codes with large distance, incoherent spherical codes,
combinatorial designs and the like are interesting not only for
constructions; they also relate the known bounds on the
parameters achievable by one object to those achievable by
another. For example, due to the reduction from binary codes
to spherical codes, any improved lower bound on the coherence
of spherical codes results in an improved upper bound on the
rates achievable by small-biased codes. Thus, it is interesting
to explore further connections of this type, for example,
whether there is a reduction from disjunct matrices to designs,
from designs to codes, and so on. Moreover, an affirmative
answer to Question 33 would imply that the seemingly unrelated
problems of finding explicit RIP-2 matrices (with $\pm 1/\sqrt{n}$
entries) and explicit binary list-decodable codes⁶ at radius
close to 1/2 are essentially equivalent. In particular, optimal
RIP-2 matrices would imply optimal binary list-decodable codes
and vice versa. One can also ask similar questions about
non-binary codes, which might be easier to construct, or
consider related variations of the $L$-wise distance.

6 Note that there is no requirement on the existence of an efficient list-
decoder for the code. Only the encoding function needs to be efficient.
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